27 research outputs found

    Exploring the Applicability of Low‑Shot Learning in Mining Software Repositories

    Background: Despite the well-documented and numerous recent successes of deep learning, the application of standard deep architectures to many classification problems within empirical software engineering remains problematic due to the large volumes of labeled data required for training. Here we make the argument that, for some problems, this hurdle can be overcome by taking advantage of low-shot learning in combination with simpler deep architectures that reduce the total number of parameters that need to be learned. Findings: We apply low-shot learning to the task of classifying UML class and sequence diagrams from GitHub, and demonstrate that surprisingly good performance can be achieved by using only tens or hundreds of examples for each category when paired with an appropriate architecture. Using a large, off-the-shelf architecture, on the other hand, doesn’t perform beyond random guessing even when trained on thousands of samples. Conclusion: Our findings suggest that identifying problems within empirical software engineering that lend themselves to low-shot learning could accelerate the adoption of deep learning algorithms within the empirical software engineering community

    Evaluation of Spatial Generalization Characteristics of a Robust Classifier as Applied to Coral Reef Habitats in Remote Islands of the Pacific Ocean

    This study was an evaluation of the spectral signature generalization properties of coral across four remote Pacific Ocean reefs. The sites under consideration have not been the subject of previous studies for coral classification using remote sensing data. Previous research regarding using remote sensing to identify reefs has been limited to in-situ assessment, with some researchers also performing temporal analysis of a selected area of interest. This study expanded the previous in-situ analyses by evaluating the ability of a basic predictor, Linear Discriminant Analysis (LDA), trained on Depth Invariant Indices calculated from the spectral signature of coral in one location to generalize to other locations, both within the same scene and in other scenes. Three Landsat 8 scenes were selected and masked for null, land, and obstructed pixels, and corrections for sun glint and atmospheric interference were applied. Depth Invariant Indices (DII) were then calculated according to the method of Lyzenga and an LDA classifier trained on ground truth data from a single scene. The resulting LDA classifier was then applied to other locations and the coral classification accuracy evaluated. When applied to ground truth data from the Palmyra Atoll location in scene path/row 065/056, the initial model achieved an accuracy of 80.3%. However, when applied to ground truth observations from another location within the scene, namely, Kingman Reef, it achieved an accuracy of 78.6%. The model was then applied to two additional scenes (Howland Island and Baker Island Atoll), which yielded an accuracy of 69.2% and 71.4%, respectively. Finally, the algorithm was retrained using data gathered from all four sites, which produced an overall accuracy of 74.1%
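
    The Depth Invariant Index step mentioned in this abstract can be sketched as follows. This is a minimal illustration of the commonly cited Lyzenga formulation, not the authors' code: the function names, the synthetic radiances, the deep-water values, and the attenuation coefficients are all assumptions made up for the example.

```python
import math


def attenuation_ratio(x_i, x_j):
    """Estimate k_i / k_j from log-transformed radiances of two bands.

    x_i and x_j are assumed to be sampled over a uniform bottom type
    at varying depth (hypothetical inputs for this sketch).
    """
    n = len(x_i)
    mean_i = sum(x_i) / n
    mean_j = sum(x_j) / n
    var_i = sum((v - mean_i) ** 2 for v in x_i) / (n - 1)
    var_j = sum((v - mean_j) ** 2 for v in x_j) / (n - 1)
    cov_ij = sum((a - mean_i) * (b - mean_j) for a, b in zip(x_i, x_j)) / (n - 1)
    a = (var_i - var_j) / (2.0 * cov_ij)
    return a + math.sqrt(a * a + 1.0)


def depth_invariant_index(l_i, l_j, deep_i, deep_j, ratio):
    """DII_ij = ln(L_i - L_si) - (k_i / k_j) * ln(L_j - L_sj)."""
    return math.log(l_i - deep_i) - ratio * math.log(l_j - deep_j)


# Synthetic example (all values are illustrative assumptions).
K_I, K_J = 0.1, 0.2          # attenuation coefficients of the two bands
DEEP_I, DEEP_J = 0.02, 0.01  # deep-water radiance in each band
depths = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

# Radiance decays exponentially with depth over the same bottom type.
band_i = [DEEP_I + 0.5 * math.exp(-2.0 * K_I * z) for z in depths]
band_j = [DEEP_J + 0.4 * math.exp(-2.0 * K_J * z) for z in depths]

x_i = [math.log(v - DEEP_I) for v in band_i]
x_j = [math.log(v - DEEP_J) for v in band_j]

ratio = attenuation_ratio(x_i, x_j)
dii = [
    depth_invariant_index(li, lj, DEEP_I, DEEP_J, ratio)
    for li, lj in zip(band_i, band_j)
]
```

    On this synthetic data the index comes out nearly identical at every depth, which is the property that makes it a usable feature for a classifier such as LDA.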

    Coral Reef Change Detection in Remote Pacific Islands Using Support Vector Machine Classifiers

    Despite the abundance of research on coral reef change detection, few studies have been conducted to assess the spatial generalization principles of a live coral cover classifier trained using remote sensing data from multiple locations. The aim of this study is to develop a machine learning classifier for coral dominated benthic cover-type class (CDBCTC) based on ground truth observations and Landsat images, evaluate the performance of this classifier when tested against new data, then deploy the classifier to perform CDBCTC change analysis of multiple locations. The proposed framework includes image calibration, support vector machine (SVM) training and tuning, statistical assessment of model accuracy, and temporal pixel-based image differencing. Validation of the methodology was performed by cross-validation and train/test split using ground truth observations of benthic cover from four different reefs. These four locations (Palmyra Atoll, Kingman Reef, Baker Island Atoll, and Howland Island) as well as two additional locations (Kiritimati Island and Tabuaeran Island) were then evaluated for CDBCTC change detection. The in-situ training accuracies against ground truth observations for Palmyra Atoll, Kingman Reef, Baker Island Atoll, and Howland Island were 87.9%, 85.7%, 69.2%, and 82.1%, respectively. The classifier attained generalized accuracy scores of 78.8%, 81.0%, 65.4%, and 67.9% for the respective locations when trained using ground truth observations from neighboring reefs and tested against the local ground truth observations of each reef. The classifier was then trained using the consolidated ground truth data of all four sites and attained a cross-validated accuracy of 75.3%. The CDBCTC change detection analysis showed a decrease in CDBCTC of 32% at Palmyra Atoll, 25% at Kingman Reef, 40% at Baker Island Atoll, 25% at Howland Island, 35% at Tabuaeran Island, and 43% at Kiritimati Island. This research establishes a methodology for developing a robust classifier and the associated Controlled Parameter Cross-Validation (CPCV) process for evaluating how well the model will generalize to new data. It is an important step for improving the scientific understanding of temporal change within coral reefs around the globe
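
    The "train on neighboring reefs, test on the local reef" evaluation described in this abstract resembles a leave-one-site-out split. The sketch below shows only that splitting idea; it is not the authors' CPCV procedure, whose details the abstract does not give. The function names and the toy site labels are assumptions, and the classifier itself (an SVM in the abstract) is left out.

```python
def leave_one_site_out(site_labels):
    """Yield (held_out_site, train_indices, test_indices) for each site.

    site_labels holds one site name per ground truth observation.
    """
    for held_out in sorted(set(site_labels)):
        train_idx = [i for i, s in enumerate(site_labels) if s != held_out]
        test_idx = [i for i, s in enumerate(site_labels) if s == held_out]
        yield held_out, train_idx, test_idx


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy observations: one site name per ground truth point (illustrative only).
site_labels = ["Palmyra", "Palmyra", "Kingman", "Kingman", "Baker", "Howland"]
splits = list(leave_one_site_out(site_labels))
```

    For each split, a classifier would be fitted on the `train_idx` observations and scored with `accuracy` on the `test_idx` observations, giving one generalization score per held-out site.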

    A Large-Scale Sentiment Analysis of Tweets Pertaining to the 2020 US Presidential Election

    We capture the public sentiment towards candidates in the 2020 US Presidential Election by analyzing 7.6 million tweets sent out between October 31st and November 9th, 2020. We apply a novel approach to first identify tweets and user accounts in our database that were later deleted or suspended from Twitter. This approach allows us to observe the sentiment held for each presidential candidate across various groups of users and tweets: accessible tweets and accounts, deleted tweets and accounts, and suspended or inaccessible tweets and accounts. We compare the sentiment scores calculated for these groups and provide key insights into the differences. Most notably, we show that deleted tweets posted after Election Day were more favorable to Joe Biden, while those posted in the lead-up to Election Day were more positive about Donald Trump. Also, the older a Twitter account was, the more positive tweets it would post about Joe Biden. The aim of this study is to highlight the importance of conducting sentiment analysis on all posts captured in real time, including those that are now inaccessible, when determining the true sentiment around the time of an event

    Trends in Opioid Use in Pediatric Patients in US Emergency Departments From 2006 to 2015

    Importance: The use of opioids to treat pain in pediatric patients has been viewed as necessary; however, this practice has raised concerns regarding opioid abuse and the effects of opioid use. To effectively adjust policy regarding opioids in the pediatric population, prescribing patterns must be better understood. Objective: To evaluate opioid prescribing patterns in US pediatric patients and factors associated with opioid prescribing. Design, Setting, and Participants: This cross-sectional study used publicly available data from the National Hospital Ambulatory Medical Care Survey from January 1, 2006, to December 31, 2015. Analysis included the use of bivariate and multivariate models to evaluate factors associated with opioid prescribing. Practitioners from emergency departments throughout the United States were surveyed, and data were collected using a representative sample of visits to hospital emergency departments. The study analyzed all emergency department visits included in the National Hospital Ambulatory Medical Care Survey for patients younger than 18 years. All statistical analysis was completed in June of 2018 and updated upon receiving reviewer feedback in October of 2018. Exposures: Information regarding participants’ medications was collected at time of visit. Participants who reported taking 1 or more opioids were identified. Main Outcomes and Measures: Evaluation of opioid prescribing patterns across demographic factors and pain diagnoses. Results: A total of 69 152 visits with patients younger than 18 years (32 727 female) were included, which were extrapolated by the National Hospital Ambulatory Medical Care Survey to represent 293 528 632 visits nationwide, with opioid use representing 21 276 831 (7.25%) of the extrapolated visits. Factors including geographic region, race, age, and payment method were associated with statistically significant differences in opioid prescribing. The Northeast reported an opioid prescribing rate of 4.69% (95% CI, 3.69%-5.70%) vs 8.84% (95% CI, 6.82%-10.86%) in the West (P = .004). White individuals were prescribed an opioid at 8.11% (95% CI, 7.23%-8.99%) of visits vs 5.31% (95% CI, 4.31%-6.32%) for nonwhite individuals (P < .001). Those aged 13 to 17 years were significantly more likely to receive opioid prescriptions (16.20%; 95% CI, 14.29%-18.12%) than those aged 3 to 12 years (6.59%; 95% CI, 5.75%-7.43%) or 0 to 2 years (1.70%; 95% CI, 1.42%-1.98%). Patients using Medicaid for payment were less likely to receive an opioid than those using private insurance (5.47%; 95% CI, 4.79%-6.15% vs 9.73%; 95% CI, 8.56%-10.90%). There was no significant difference in opioid prescription across sexes. Opioid prescribing rates decreased when comparing 2006 to 2010 with 2011 to 2015 (8.23% [95% CI, 6.75%-9.70%] vs 6.30% [95% CI, 5.44%-7.17%]; P < .001); however, opioid prescribing rates remained unchanged in specific pain diagnoses, including pelvic and back pain. Conclusions and Relevance: This research demonstrated an overall reduction in opioid use among pediatric patients from 2011 to 2015 compared with the previous 5 years; however, there appear to be variations in factors associated with opioid prescribing. The association of location, race, payment method, and pain diagnoses with rates of prescribing of opioids suggests areas of potential quality improvement and further research

    A Quantitative Validation of Multi-Modal Image Fusion and Segmentation for Object Detection and Tracking

    In previous works, we have shown the efficacy of using Deep Belief Networks, paired with clustering, to identify distinct classes of objects within remotely sensed data via cluster analysis and qualitative analysis of the output data in comparison with reference data. In this paper, we quantitatively validate the methodology against datasets currently being generated and used within the remote sensing community, as well as show the capabilities and benefits of the data fusion methodologies used. The experiments take the output of our unsupervised fusion and segmentation methodology and map it to various labeled datasets at different levels of global coverage and granularity in order to test our models’ capabilities to represent structure at finer and broader scales, using many different kinds of instrumentation that can be fused when applicable. In all cases tested, our models show a strong ability to segment the objects within input scenes, use multiple datasets fused together where appropriate to improve results, and, at times, outperform the pre-existing datasets. The success here will allow this methodology to be used in concrete use cases and become the basis for future dynamic object tracking across datasets from various remote sensing instruments

    Leveling the Playing Field: Supporting Neurodiversity via Virtual Realities

    Neurodiversity is a term that encapsulates the diverse expression of human neurology. By thinking in broad terms about neurological development, we can become focused on delivering a diverse set of design features to meet the needs of the human condition. In this work, we move toward developing virtual environments that support variations in sensory processing. If we understand that people have differences in sensory perception that result in their own unique sensory traits, many of which are clustered by diagnostic labels such as Autism Spectrum Disorder (ASD), Sensory Processing Disorder, Attention-Deficit/Hyperactivity Disorder, Rett syndrome, dyslexia, and so on, then we can leverage that knowledge to create new input modalities for accessible and assistive technologies. In an effort to translate differences in sensory perception into new variations of input modalities, we focus this work on ASD. ASD has been characterized by a complex sensory signature that can impact social, cognitive, and communication skills. By providing assistance for these diverse sensory perceptual abilities, we create an opportunity to improve the interactions people have with technology and the world. In this paper, we describe, through a variety of examples, the ways to address sensory differences to support neurologically diverse individuals by leveraging advances in virtual reality

    Paper Prototyping Comfortable VR Play for Diverse Sensory Needs

    We co-designed paper prototype dashboards for virtual environments for three children with diverse sensory needs. Our goal was to determine individual interaction styles in order to enable comfortable and inclusive play. As a first step towards an inclusive virtual world, we began with designing for three sensory-diverse children who have labels of neurotypical, ADHD, and autism respectively. We focused on their leisure interests and their individual sensory profiles. We present the results of co-design with family members and paper prototyping sessions conducted by family members with the children. The results contribute preliminary empirical findings for accommodating different levels of engagement and empowering users to adjust environmental thresholds through interaction design

    A Cluster Analysis of Challenging Behaviors in Autism Spectrum Disorder

    We apply cluster analysis to a sample of 2,116 children with Autism Spectrum Disorder in order to identify patterns of challenging behaviors observed in home and center-based clinical settings. This is the largest study of this type to date and the first to employ machine learning. Our results indicate that while the presence of multiple challenging behaviors is common, in most cases a dominant behavior emerges. Furthermore, the trend is also observed when we train our cluster models on the male and female samples separately. This work provides a basis for future studies to understand the relationship of challenging behavior profiles to learning outcomes, with the ultimate goal of providing personalized therapeutic interventions with maximum efficacy and minimum time and cost

    A Program Evaluation of Home and Center-Based Treatment for Autism Spectrum Disorder

    The present study aimed to retrospectively compare the relative rates of mastery of exemplars for individuals with ASD (N = 313) who received home-based and center-based services. A between-group analysis found that participants mastered significantly more exemplars per hour when receiving center-based services than home-based services. Likewise, a paired-sample analysis found that participants who received both home and center-based services had mastered 100% more per hour while at the center than at home. These analyses indicated that participants demonstrated higher rates of learning during treatment that was provided in a center setting than in the participant’s home